arguments of verbs in Japanese, given that the number of
candidate antecedents exceeds that of the zero anaphors. Their
use of functional notions, viz. topicality and empathy,
naturally extends the role preferences of the underlying centering
model, but, unlike in our case, no structural ambiguities are
involved. In contrast, our approach applies to the representation
of both types of ambiguities.

We motivated the treatment of (structural) ambiguities 
within the centering framework as a consequence of assum- 
ing an incremental mode of anaphor resolution, a topic that 
has not been raised in the centering literature so far. This is 
surprising insofar as even psycholinguistic studies on center- 
ing (Gordon et al., 1993; Brennan, 1995) do not touch upon 
this issue, though the immediacy of anaphor resolution is a 
common theme in cognitive text processing studies (Just & 
Carpenter, 1987; Sanford & Garrod, 1989). 

Our proposal, based on a dependency-style grammar
model (Hahn et al., 1994), claims to integrate both the
sentence level and the text level of anaphora analysis.
Furthermore, it is fully integrated with the terminological
reasoning facilities needed for in-depth text understanding,
and it is based on an incremental, single-pass procedure.
Thus, it is superior to the work on binding theory as devel- 
oped within the GB framework (Chomsky, 1981) that is re- 
stricted to the sentence-level of analysis; just recently, how- 
ever, Merlo (1993) has proposed an incremental procedure 
for computing intrasentential coreferences based on binding 
theory constraints. Also Haddock (1987) considers an incre- 
mental mode of anaphora resolution which boils down to a 
variable binding, i.e., a constraint satisfaction problem in the 
context of a Combinatory Categorial Grammar. All of these
approaches neglect the important aspect of preference scaling,
which is needed to select properly among several candidate
discourse units as antecedents. The same drawback applies
to the framework of DRT (Kamp & Reyle, 1993), which is
also non-incremental.

Conclusions 

Our approach to anaphora resolution extends the original 
centering model by embedding the centering approach into 
an incremental, single-pass processing model, by providing 
data structures for the centering algorithm which allow for 
the treatment of local and global (parsing) ambiguities, and 
by homogeneously integrating the resolution of sentence- 
level (intrasentential) as well as text-level (intersentential) 
anaphora based on the strict requirements set up by the bind- 
ing criteria (adapted to a dependency grammar framework). 

The anaphora resolution module has been realized as part 
of a dependency parser for the German language. The parser 
has been implemented in Actalk (Briot, 1989), an actor lan- 
guage dialect of Smalltalk. The current lexicon contains 
nearly 3,000 lexical entries and corresponding concept
descriptions from two domains (information technology and
medicine) available from the LOOM knowledge representa- 
tion system (MacGregor & Bates, 1987). 
Figure 1: Protocol for the Local Ambiguity Case at the Text Level




Figure 2: Protocol for the Local Ambiguity Case at the Phrase Level 



(cf. Table 2). Hence, the local ambiguity with respect to "sie"
no longer persists and has been reduced to a global one.

Summarizing, we propose a two-level representation of
structural ambiguities for the centering model, in which
local and global structural ambiguities are made explicit.
Global ambiguities are represented as sets of forward-looking 
centers (a so-called centering set, in the underlying imple- 
mentation realized as CenterActor), while local ambiguities 
are represented as a set of such centering sets (in the under- 
lying implementation realized as CenteringActor). The cre- 
ation and management of these sets is under control of the 
parser, while the management of entities within these centers 
remains in the realm of the centering theory. The proposal we 
make does not depend on any choice of the underlying gram- 
mar or semantic theory (although binding criteria should be 
expressible).



Related Work 

The centering model, from its inception (Grosz et al., 1983) 
to its most recent formulation (Grosz et al., 1995), has been 
considered a methodological framework for anaphora resolu- 
tion. With the exception of Brennan et al. (1987), whose 
implementation was interfaced with a concrete HPSG sys- 
tem, the centering approach seems to have been developed as 
a stand-alone theory vehicle, with almost no attention given 
to its integration into a larger NLU system framework. This 
might explain why the issue of structural ambiguity han- 
dling has been largely ignored in the centering framework. 
The problem of referential ambiguity to which our proposal 
is equally applicable has recently been discussed by Walker 
et al. (1994). However, their problem concerns the choice 
options arising for the assignment of alternative discourse en- 
tities from the forward-looking center list to zero-anaphorized 



and a similar syntactic head-modifier compatibility test⁴. We
now describe the main threads of the algorithm for local am- 
biguity management during anaphora resolution (cf. Fig. 1). 

Consider, e.g., sentence (2) of the already introduced text 
fragment. An attachment of the definite NP "diese Fest- 
platte" (this hard disk) at its prospective head "erreicht" 
(compares to) is tried. Since the NP is ambiguous with re- 
spect to case, a local ambiguity accounting for the subject 
and object reading is created⁵.

1, 1a: Two different SearchNomAntecedent messages with
the argument theAttachment (the dependency relation
between the NP and "erreicht", i.e., either subject or
object) are sent simultaneously from the Anaphor to the
PhraseActor.

2-4, 2a-4a: Both messages are forwarded from the
PhraseActor to the ContainerActor, the ParserActor, and
the CenteringActor of the preceding sentence.

5, 5a: It is crucial that for every SearchAntecedent message 
which reaches the CenteringActor, that actor is copied, 
leaving the master copy unchanged. This guarantees that 
each locally ambiguous phrase which contains an anaphor 
manipulates its own centering data structures. 

6, 6a: The messages are distributed to all CenterActors 
where the argument theAttachment is copied in order to 
provide for consistent data in a distributed, concurrent en- 
vironment (the copy action is of relevance only in those 
cases where global and local ambiguities are interleaved). 

7, 7a: NomAnaphorTest succeeds in both cases (the most
preferred element of the Cf of sentence (1), viz. LPS-105,
fulfills the required conceptual subsumption condition
relative to "Festplatte" (hard disk)).

8, 8a: Hence two AntecedentFound messages are sent to each 
corresponding anaphor. 

9, 9a: The semantic predicate permit⁶ succeeds with respect
to the word actors LPS 105 (Festplatte) and erreicht for 
both argument positions, thus the dependency relations the- 
Attachment, viz. subject and object, are confirmed. 

10, 10a: The resolved anaphors send an AnaphorSucceed 
message to the corresponding CenterActors. 

11, 11a: At these CenterActors the determined antecedents
(LPS-105) are removed from the corresponding Cf lists. 
The removal of the antecedent from the Cf list prevents 

4 We strictly separate the search for the proper antecedent and the 
evaluation of its conceptual compatibility as the modifier of a head. 

5 The bold numbers in the text and the edge numbers in Fig. 1 
and 2 refer to the same computation steps. The index a indicates 
parallel distribution of messages. The directed edges in both figures 
illustrate the basic flow of control caused by the message passing. 

6 permit accounts for type and further conceptual admissibility
constraints (number restrictions, etc.).



it from being (incorrectly) reused as a possible antecedent 
for yet another anaphor within the same sentence. 
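
The copy and removal steps (5 and 11) can be sketched as follows. This is a minimal illustration, not the Actalk implementation: the function name search_antecedent and the two stand-in predicates are hypothetical, and a plain Python list stands in for the Cf held by a CenterActor.

```python
from typing import Callable, List, Optional, Tuple


def search_antecedent(
    cf: List[str], accepts: Callable[[str], bool]
) -> Tuple[Optional[str], List[str]]:
    """Search a private copy of cf in preference order.

    Returns the accepted antecedent (or None) and the private Cf list
    with that antecedent removed.
    """
    private_cf = list(cf)  # step 5: copy, the master list stays unchanged
    for candidate in private_cf:  # most preferred element first
        if accepts(candidate):  # stand-in for the anaphor test plus permit
            # step 11: consume the antecedent so it cannot be reused
            remaining = [c for c in private_cf if c != candidate]
            return candidate, remaining
    return None, private_cf  # Cf list exhausted


def is_hard_disk(concept: str) -> bool:
    # Hypothetical stand-in for NomAnaphorTest relative to "Festplatte".
    return concept == "LPS-105"


def fits_erzielt(concept: str) -> bool:
    # Hypothetical stand-in for permit relative to "erzielt".
    return concept != "PERFORMANCE"


master_cf = ["LPS-105", "PERFORMANCE"]  # Cf of sentence (1)

# One search per locally ambiguous reading; each gets its own copy.
subject_reading = search_antecedent(master_cf, is_hard_disk)
object_reading = search_antecedent(master_cf, is_hard_disk)

# A later anaphor in the same reading only sees what is left over.
pronoun_result = search_antecedent(subject_reading[1], fits_erzielt)
```

Because every search works on its own copy, the two readings cannot interfere with each other. A result of None corresponds to an exhausted Cf list.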

As a special case of local ambiguities, consider the second
anaphor in sentence (2), the pronoun "sie" (it). When "sie"
is attached to its prospective head "erzielt" (scores) in the
subordinate clause of sentence (2), its case ambiguity, viz.
nominative and accusative, causes the corresponding
PhraseActors to be duplicated (as the matrix clause is
ambiguous, too, four interpretations must be considered; in
the corresponding Fig. 2 only two readings are shown). Four
SearchAntecedent messages are triggered. Steps 1-8 are
performed as described above. As the corresponding Cfs of (1)
contain only PERFORMANCE (LPS-105 has already been consumed
as a result of previously resolving "diese Festplatte"), the
predicate permit fails with respect to PERFORMANCE and
"erzielt" (Step 9).

12, 12a: The anaphor sends an AnaphorReject message to 
the CenterActor. 

13, 13a: The Cf list is exhausted. 

14, 14a: Hence, the mechanism for intrasentential anaphora 
resolution is triggered. The search for an antecedent is 
performed within the PhraseActor which contains both the 
anaphor and the antecedent (cf. Fig. 2; it differs from Fig. 
1 mainly with respect to the description of the dependency
structures within each PhraseActor, which are depicted in 
greater detail). 

15, 15a: Each SearchAntecedent message is forwarded from 
its initiator "sie" to its prospective head Head1 "erzielt"
(scores) which d-binds the initiator. 

16, 16a: Next the message is forwarded to Head2 "erreicht" 
(compares to). 

17, 17a: Then the message is forwarded to possible
Antecedents which are modifiers of Head1, where "diese
Festplatte" (this hard disk) and "ST-3144" are reached,
respectively.

18, 18a: PronAnaphorTest and the semantic predicate permit 
succeed ("diese Festplatte", which is resolved to LPS- 
105, and "erzielt" as well as ST-3144 and "erzielt" are 
both successfully tested by permit). 

19, 19a: An AntecedentFound message is sent to the 
anaphor. The dependency relation theAttachment, viz. sub- 
ject, is confirmed in both cases between "erzielt" (scores) 
and LPS-105 as well as ST-3144, respectively. Similarly, 
the object dependency relation is established, until the
accusative phrase "den zweiten Platz" (second-best in this
category) invalidates this local ambiguity.

Upon completion of the analysis of sentence (2) two Cen- 
terActors continue to exist with corresponding Cb/Cf data 



& Hahn (1995) for an elaborated discussion of d-binding cri- 
teria). These constraints hold for intra- as well as intersen- 
tential anaphora, thus seamlessly incorporating the discourse 
level of grammatical description (for a comprehensive survey 
of the grammar formalism, cf. Hahn et al. (1994)).³

The possible antecedents that can be reached via anaphoric 
relations, irrespective of whether they occur within the 
current sentence or beyond, are described by
isPotentialAnaphoricAntecedentOf (cf. Table 4), which incorporates
d-binds (cf. Table 3).



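\[
\begin{aligned}
x\ \text{d-binds}\ y \;:\Leftrightarrow\;
 & (x\ \mathit{head}^{+}\ y) \\
 & \wedge\ \neg\exists z{:}\ ((x\ \mathit{head}^{+}\ z) \wedge (z\ \mathit{head}^{+}\ y) \\
 & \qquad \wedge\ (z\ \mathit{isa}_C^{*}\ \mathit{finiteVerb} \\
 & \qquad\quad \vee\ \exists u{:}\ (z\ \mathit{head}\ u \\
 & \qquad\qquad \wedge\ ((z\ \mathit{spec}\ u \wedge u\ \mathit{isa}_C^{*}\ \mathit{DetPossessive}) \\
 & \qquad\qquad\quad \vee\ (z\ \mathit{saxGen}\ u \wedge u\ \mathit{isa}_C^{*}\ \mathit{Noun}) \\
 & \qquad\qquad\quad \vee\ (z\ \mathit{ppAtt}\ u \wedge u\ \mathit{isa}_C^{*}\ \mathit{Noun}) \\
 & \qquad\qquad\quad \vee\ (z\ \mathit{genAtt}\ u \wedge u\ \mathit{isa}_C^{*}\ \mathit{Noun})))))
\end{aligned}
\]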



Table 3: D-binding Constraint 



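\[
\begin{aligned}
x\ \text{isPotentialAnaphoricAntecedentOf}\ y \;:\Leftrightarrow\;
 & \neg\exists z{:}\ (z\ \text{d-binds}\ x \wedge z\ \text{d-binds}\ y) \\
 & \wedge\ ((\exists u{:}\ u\ \text{d-binds}\ y \wedge u\ \mathit{head}^{+}\ x) \rightarrow x\ \mathit{left}^{+}\ y)
\end{aligned}
\]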



Table 4: Constraint on Potential Anaphoric Antecedents 
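
The two constraints of Tables 3 and 4 can be read procedurally, as in the following sketch. It rests on several simplifying assumptions that are ours, not the paper's: the Node class, the relation labels "subject", "object" and "subord", and the word positions are hypothetical; the subclass relation on lexical classes is reduced to equality of class names; and "x left+ y" is approximated by comparing word positions.

```python
from typing import Iterator, List, Optional


class Node:
    """A word in a dependency tree (hypothetical minimal representation)."""

    def __init__(self, word: str, lex_class: str, position: int) -> None:
        self.word = word
        self.lex_class = lex_class
        self.position = position
        self.head: Optional["Node"] = None
        self.relation: Optional[str] = None  # label of the edge from the head
        self.modifiers: List["Node"] = []

    def attach(self, modifier: "Node", relation: str) -> None:
        modifier.head, modifier.relation = self, relation
        self.modifiers.append(modifier)


def heads(node: Node) -> Iterator[Node]:
    """Yield the transitive heads of node, nearest first (head+)."""
    while node.head is not None:
        node = node.head
        yield node


def closes_domain(z: Node) -> bool:
    """The disjunction inside Table 3: finite verb or possessed nominal."""
    if z.lex_class == "finiteVerb":
        return True
    for u in z.modifiers:
        if u.relation == "spec" and u.lex_class == "DetPossessive":
            return True
        if u.relation in {"saxGen", "ppAtt", "genAtt"} and u.lex_class == "Noun":
            return True
    return False


def d_binds(x: Node, y: Node) -> bool:
    chain = list(heads(y))
    if x not in chain:  # x must be a transitive head of y
        return False
    between = chain[: chain.index(x)]  # nodes strictly between y and x
    return not any(closes_domain(z) for z in between)


def is_potential_antecedent(x: Node, y: Node, nodes: List[Node]) -> bool:
    if any(d_binds(z, x) and d_binds(z, y) for z in nodes):
        return False  # x and y share a binding domain
    binder_dominates_x = any(d_binds(u, y) and u in heads(x) for u in nodes)
    return (not binder_dominates_x) or x.position < y.position


# Skeleton of sentence (2); determiners and adjuncts are omitted.
erreicht = Node("erreicht", "finiteVerb", 1)
festplatte = Node("Festplatte", "Noun", 2)
seagate = Node("ST-3144", "Noun", 3)
sie = Node("sie", "Pronoun", 4)
erzielt = Node("erzielt", "finiteVerb", 5)
erreicht.attach(festplatte, "subject")
erreicht.attach(seagate, "object")
erreicht.attach(erzielt, "subord")
erzielt.attach(sie, "subject")
nodes = [erreicht, festplatte, seagate, sie, erzielt]
```

In this toy tree "erzielt" d-binds "sie", whereas "erreicht" does not, because the finite verb "erzielt" intervenes. Both NPs of the matrix clause therefore remain potential antecedents of "sie", while neither can serve as antecedent of the other.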

PronAnaphorTest from Table 5 contains the major gram- 
matical agreement constraint (covering gender, number and 
person) for some anaphoric pronoun and its nominal an- 
tecedent, while NomAnaphorTest from Table 6 captures the 
major conceptual subsumption constraint for the nominal an- 
tecedent and a corresponding anaphoric definite NP. 



c-command, while the DG constraint on anaphora (cf. Table 4) re- 
lates to the major binding principles of GB (Chomsky, 1981). An 
approach to the incremental computation of intrasentential corefer- 
ences based on Chomsky's binding theory is given by Merlo (1993). 

3 For the definitions of the grammatical predicates below, the
following conventions hold: $\mathit{isa}_C$ denotes the subclass
relation among lexical classes (parts of speech), $\sqcup$ the
unification operation, $\bot$ the inconsistent element. Let u be a
complex feature term and l a feature name; then the extraction u\l
yields the value of l in u if l is defined, and u\l gives $\bot$ in
all other cases. Semantic and conceptual knowledge is represented
via a KL-ONE-style classification-based knowledge representation
language (MacGregor & Bates, 1987), with $\mathit{isa}_F$ denoting
the subclass relation among concepts. Furthermore, object.attribute
denotes the value of the property attribute at object and the symbol
self refers to the current lexical item. The grammar specification
language, in addition, incorporates topological primitives for
relations within dependency trees, such as "x occurs left of y" and
"x is head of y"; rel+ and rel* denote the transitive and
transitive/reflexive closure of a relation rel, respectively. The
following dependency relations will be used: specifier-of (spec),
Saxon genitive (saxGen), prepositional and genitival attribute
(ppAtt, genAtt).



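\[
\begin{aligned}
\mathit{PronAnaphorTest}(\mathit{pro}, \mathit{ante}) \;:\Leftrightarrow\;
 & \mathit{ante}\ \mathit{isa}_C^{*}\ \mathit{Nominal} \\
 & \wedge\ ((\mathit{pro.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{gen}) \sqcup (\mathit{ante.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{gen}) \neq \bot) \\
 & \wedge\ ((\mathit{pro.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{num}) \sqcup (\mathit{ante.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{num}) \neq \bot) \\
 & \wedge\ ((\mathit{pro.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{pers}) \sqcup (\mathit{ante.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{pers}) \neq \bot)
\end{aligned}
\]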



Table 5: Constraint on Pronominal Anaphora 



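\[
\begin{aligned}
\mathit{NomAnaphorTest}(\mathit{defNP}, \mathit{ante}) \;:\Leftrightarrow\;
 & \mathit{ante}\ \mathit{isa}_C^{*}\ \mathit{Nominal} \\
 & \wedge\ ((\mathit{defNP.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{num}) \sqcup (\mathit{ante.features} \backslash \mathit{self} \backslash \mathit{agr} \backslash \mathit{num}) \neq \bot) \\
 & \wedge\ \mathit{ante.concept}\ \mathit{isa}_F^{*}\ \mathit{defNP.concept}
\end{aligned}
\]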



Table 6: Constraint on Nominal Anaphora 
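
The two tests of Tables 5 and 6 might be approximated as in the sketch below. The modeling choices are assumptions made for illustration only: feature values are sets of atomic values, unification is set intersection with the empty set playing the role of the inconsistent element, and both subclass relations are given by small hand-written parent maps. The concept names HARD-DISK and PRODUCT and the helper word are hypothetical.

```python
from typing import Any, Dict, FrozenSet, Optional

BOTTOM: FrozenSet[str] = frozenset()  # the inconsistent element

# Hypothetical fragments of the lexical class and concept hierarchies.
LEX_CLASSES = {"Noun": "Nominal", "Pronoun": "Nominal"}
CONCEPTS = {"LPS-105": "HARD-DISK", "ST-3144": "HARD-DISK", "HARD-DISK": "PRODUCT"}

Item = Dict[str, Any]


def isa_star(item: str, ancestor: str, parents: Dict[str, str]) -> bool:
    """Reflexive-transitive subclass test over a parent map."""
    node: Optional[str] = item
    while node is not None:
        if node == ancestor:
            return True
        node = parents.get(node)
    return False


def agrees(a: Item, b: Item, feature: str) -> bool:
    """Unify one agreement feature; undefined features yield BOTTOM."""
    left = a["agr"].get(feature, BOTTOM)
    right = b["agr"].get(feature, BOTTOM)
    return (left & right) != BOTTOM


def pron_anaphor_test(pro: Item, ante: Item) -> bool:
    return isa_star(ante["lex_class"], "Nominal", LEX_CLASSES) and all(
        agrees(pro, ante, f) for f in ("gen", "num", "pers")
    )


def nom_anaphor_test(def_np: Item, ante: Item) -> bool:
    return (
        isa_star(ante["lex_class"], "Nominal", LEX_CLASSES)
        and agrees(def_np, ante, "num")
        and isa_star(ante["concept"], def_np["concept"], CONCEPTS)
    )


def word(lex_class: str, concept: Optional[str], **agr: str) -> Item:
    """Build a toy lexical item with atomic agreement values."""
    return {
        "lex_class": lex_class,
        "concept": concept,
        "agr": {name: frozenset({value}) for name, value in agr.items()},
    }


sie = word("Pronoun", None, gen="fem", num="sg", pers="3")
lps_105 = word("Noun", "LPS-105", gen="fem", num="sg", pers="3")
leistung = word("Noun", "PERFORMANCE", gen="fem", num="sg", pers="3")
platz = word("Noun", "RANK", gen="masc", num="sg", pers="3")
diese_festplatte = word("Noun", "HARD-DISK", gen="fem", num="sg", pers="3")
```

With these toy entries, the definite NP "diese Festplatte" is compatible with LPS-105 but not with PERFORMANCE, and the pronoun "sie" agrees with LPS-105 but not with the masculine noun "Platz".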

Resolution of Anaphora 

The actor computation model (Agha & Hewitt, 1987)
provides the background for the procedural interpretation of
lexicalized grammar specifications, such as those given in the
previous section, in terms of so-called word actors. Word actors
communicate via asynchronous message passing; an actor 
can only send messages to other actors it knows about, its 
so-called acquaintances. The arrival of a message at an actor 
triggers the execution of a method that is composed of gram- 
matical predicates (for a survey, cf. Neuhaus & Hahn (1996)). 

The basic data structures for anaphora resolution are or- 
ganized as acquaintances of specific actors. Besides word 
actors for the lexical level of analysis, phrases are encapsu- 
lated in PhraseActors, and one or more PhraseActors which 
cover the same sequence of words but assign different syntac- 
tic interpretations (local ambiguities) to it are encapsulated in 
ContainerActors. For every sentence, its associated unique
ParserActor is acquainted with a CenteringActor which, for 
reasons of ambiguity handling, is acquainted with one or 
more CenterActors. Each of these CenterActors has a pref- 
erentially ordered list of forward-looking centers (Cf) and 
a single backward-looking center (Cb). The usual criteria 
for centering apply at this representation level (Grosz et al., 
1995). We extend this basic model, however, in that we pro- 
vide several instances of CenteringActors to account for local 
ambiguities within an utterance, while different CenterActors 
represent global ambiguities of single utterances. Hence,
unambiguous centering is a special case, where a single
CenteringActor is acquainted with only a single CenterActor.
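
The nesting of these acquaintances can be pictured with plain data classes. The sketch below only mirrors the containment relations described above; it does not model message passing, and the field names as well as the method is_unambiguous are our own illustrative choices.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CenterActor:
    """One centering reading: a Cb and a preference-ordered Cf list."""
    cb: Optional[str]
    cf: List[str] = field(default_factory=list)


@dataclass
class CenteringActor:
    """Groups the CenterActors that represent global ambiguities."""
    center_actors: List[CenterActor] = field(default_factory=list)


@dataclass
class ParserActor:
    """One per sentence; several CenteringActors reflect local ambiguities."""
    sentence: int
    centering_actors: List[CenteringActor] = field(default_factory=list)

    def is_unambiguous(self) -> bool:
        # Special case: one CenteringActor acquainted with one CenterActor.
        return (
            len(self.centering_actors) == 1
            and len(self.centering_actors[0].center_actors) == 1
        )


sentence_1 = ParserActor(
    1, [CenteringActor([CenterActor("LPS-105", ["LPS-105", "PERFORMANCE"])])]
)

# Two CenterActors remain after sentence (2): a global ambiguity.
sentence_2 = ParserActor(
    2,
    [
        CenteringActor(
            [
                CenterActor("LPS-105", ["LPS-105", "ST-3144"]),
                CenterActor("LPS-105", ["ST-3144", "LPS-105"]),
            ]
        )
    ],
)
```

The Cf lists of sentence (2) are abbreviated to their first two elements here.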

Anaphora analysis encompasses the procedural interpreta- 
tion of the declarative constraints given in the previous sec- 
tion. For pronominal anaphors, the SearchPronAntecedent 
message is triggered by the successful syntactic test that the 
pronoun may be modifier of its head. For nominal anaphors, 
the SearchNomAntecedent message is triggered by the attach- 
ment of a definite determiner as a modifier to its head noun 



highly ranked element of the current forward-looking centers
or not. The theory claims, above all, that to the extent a
discourse adheres to all these centering constraints (e.g.,
realization constraints on pronouns, preferences among types of
center transitions), its local coherence will increase and the
inference load placed upon the hearer will decrease. Hence the
tremendous importance of fleshing out the relevant and most
restrictive, though still general, centering constraints.

Incremental Centering and Ambiguity 

In this section, we argue for an extension of the centering 
model that accounts for ambiguities generated by the incre- 
mental operation of the parsing component of a text under- 
standing system. Though we also provide for mechanisms 
that deal with global structural ambiguities, we here con- 
centrate on local structural ambiguities in the phrase which 
contains an anaphor. These can be directly attributed to 
the incremental processing mode, where each lexical ele- 
ment is integrated in syntactic structures and semantically 
interpreted as early as possible. As the anaphora resolution 
is also executed incrementally, local syntactic ambiguities 
(which cause different referential entities to emerge at the se- 
mantic/conceptual level of interpretation) must be accessible 
through the data structures of the centering algorithm in order 
to maintain local, alternative center readings. 

Consider the text fragment ((1)-(3)) taken from the corpus
of product reviews: 

(1) In der Leistung konnte die LPS 105 ebenfalls weitgehend 
überzeugen.

(As far as performance is concerned, the LPS 105 also produced 
rather compelling results.) 

(2) Bei der mittleren Zugriffszeit (16,5 ms) erreicht diese Festplatte 
die Seagate ST-3144, womit sie in dieser Disziplin den zweiten 
Platz erzielt. 

(Regarding the mean access time (16.5 ms) this hard disk
compares to the Seagate ST-3144, by which it scores second-best in
this category.)

(3) Auch beim Datendurchsatz erweist sie sich als hochkarätiges
Produkt.

(Also, considering data throughput it turns out to be a high-caliber 
product.) 

Sentence (1) has a unique structural analysis; the
forward-looking centers (Cf) consist of two semantic/conceptual
elements, the LPS-105 hard disk and PERFORMANCE (cf. Table
2). In sentence (2), a nominal anaphor occurs, "diese
Festplatte" (this hard disk), which is resolved to LPS-105 from
the previous sentence. Unfortunately, the noun phrase "diese
Festplatte" may be nominative as well as accusative and may
alternatively be attached to the verb "erreicht" (compares to)
in its subject or its object role (we assume a dependency
grammar framework as briefly described in the following section).
At this stage, one cannot determine which of the grammatical
functions is the correct one; thus a structural ambiguity has
been identified. Since the second NP in this sentence ("die
Seagate ST-3144") is ambiguous with respect to both of these



cases, too, the parser produces two structurally and
conceptually ambiguous readings (with inverted subject/object
instantiations; given appropriate stress marking both readings
are equally plausible). As a consequence, two different Cfs¹
have to be created (cf. Table 2), which indicate two different
center transitions, viz. continuation vs. retention, eligible at
the end of the analysis of the second sentence. This choice
option becomes crucial for the resolution of the pronoun "sie"
(it) in sentence (3), as it depends on the appropriate selection
of one of the two different Cfs. In the case of the CONTINUE
transition (the Cb of the previous utterance is also the highest
ranked element of the Cfs of the current utterance) LPS-105
is preferred as the antecedent, while in the case of the
RETAIN transition (the Cb of the previous utterance is not the
highest ranked element of the Cfs of the current utterance)
it is ST-3144. Depending on how the text actually proceeds,
either one is equally possible. So, for the actual anaphora
resolution the transition type preferences (Rule 2 in Grosz et al.
(1995)) are of no help at all in deciding among these variants.
We, therefore, conclude that additional representation
devices have to be supplied to keep track of these structurally
induced ambiguities at the center level.



(1)   Cb:  LPS-105: LPS 105                       CONTINUE
      Cf:  [LPS-105: LPS 105,
            PERFORMANCE: Leistung]

(2)   Cb1: LPS-105: Festplatte                    CONTINUE
      Cf1: [LPS-105: Festplatte,
            ST-3144: Seagate ST-3144,
            ACCESS-TIME: Zugriffszeit,
            RANK: Platz,
            CATEGORY: Disziplin]

      Cb2: LPS-105: Festplatte                    RETAIN
      Cf2: [ST-3144: Seagate ST-3144,
            LPS-105: Festplatte,
            ACCESS-TIME: Zugriffszeit,
            RANK: Platz,
            CATEGORY: Disziplin]



Table 2: Centering Data for Sentences (1) and (2) 
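
The two transition labels for sentence (2) can be derived mechanically from the data in Table 2. The sketch below follows the usual definitions in the centering literature; the function name transition and the labels for the two shift types are our own choices, and the Cf is assumed to be a totally ordered list.

```python
from typing import List


def transition(cb_prev: str, cb_cur: str, cf_cur: List[str]) -> str:
    """Classify the transition between two adjacent utterances."""
    preferred = cf_cur[0]  # highest ranked element of the current Cf
    if cb_cur == cb_prev:
        return "CONTINUE" if cb_cur == preferred else "RETAIN"
    return "SMOOTH-SHIFT" if cb_cur == preferred else "ROUGH-SHIFT"


cb_sentence_1 = "LPS-105"
cb_sentence_2 = "LPS-105"  # identical in both readings

cf_reading_1 = ["LPS-105", "ST-3144", "ACCESS-TIME", "RANK", "CATEGORY"]
cf_reading_2 = ["ST-3144", "LPS-105", "ACCESS-TIME", "RANK", "CATEGORY"]

label_1 = transition(cb_sentence_1, cb_sentence_2, cf_reading_1)
label_2 = transition(cb_sentence_1, cb_sentence_2, cf_reading_2)
```

The two readings differ only in the order of the first two Cf elements, which is enough to change the transition from CONTINUE to RETAIN and, with it, the preferred antecedent for "sie" in sentence (3).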



Grammar Constraints on Anaphora 

We now consider several constraints on anaphora which apply 
both to the sentence-level and text-level of anaphora analysis. 
These descriptions will later serve as a framework for con- 
sidering local ambiguity within the centering approach. We 
here adapt the common binding criteria to the methodologi- 
cal requirements of a fully lexicalized dependency grammar 
(DG), introducing the central notion of d-binding² (cf. Strube

1 To simplify the presentation, we will assume the canonical
ordering on Cf based on grammatical roles, viz. SUBJECT >
OBJECT(S) > OTHERS (Grosz et al., 1995, p.214). We have clear
evidence, however, that this is inappropriate, at least for German
and related free word order languages, and argue for ordering
criteria based on the functional information structure of utterances
in terms of topic/comment or theme/rheme patterns in a companion
paper (Strube & Hahn, 1996).

2 The definition of d-binding (cf. Table 3) corresponds to the gov- 
erning category in GB terminology, which relies upon the notion of 
Abstract

In this paper, we present a model of anaphor resolution within 
the framework of the centering model. The consideration of 
an incremental processing mode introduces the need to man- 
age structural ambiguity at the center level. Hence, the cen- 
tering framework is further refined to account for local and 
global parsing ambiguities which propagate up to the level of 
center representations, yielding moderately adapted data struc- 
tures for the centering algorithm. 

Introduction 

Psycholinguistic studies have revealed ample evidence for 
the incrementality of human language comprehension, not 
only at the phrasal and clausal level but also at the dis- 
course level of anaphora resolution (Just & Carpenter, 1987; 
Sanford & Garrod, 1989). Correspondingly, incremental pro- 
cessing has also become a major challenge for cognitively 
plausible, computational models of natural language under- 
standing (Jurafsky, 1992; Sturt, 1995), and text understand- 
ing (Granger et al., 1986) in particular. Introducing incre- 
mentality into the centering model (Grosz et al., 1995), the 
methodological framework for our approach to the resolution 
of (pro)nominal anaphora, however, is not at all straightfor- 
ward. In particular, incremental processing introduces (lo- 
cal) ambiguities at significant rates, which cannot be prop- 
erly accounted for at the center level in the original model. 
Though centering strives for the elimination of referential 
ambiguities, the implications of structural ambiguities have 
been completely ignored so far. 

We have gathered some data, summarized in Table 1, 
to give an empirical assessment of the relevance of the is- 
sue under investigation. Altogether 47 texts (product re- 
views from the information technology domain) were ana- 
lyzed which consist of 32291 words, with 230 occurrences of 
(un)ambiguous pronouns. 



ambiguous         174 (76 %)
   locally        145 (63 %)
   globally        29 (13 %)
unambiguous        56 (24 %)



Table 1: Ambiguity Distribution Patterns of Pronouns

Given our text corpus, 87% of the sentences could have been 
processed by the original, non-incremental centering algo- 



rithm (only global ambiguities could not). This rate drops 
dramatically to only 24% when we assume an incremental 
operation mode. These data reflect the impact of the local 
ambiguities which are resolved at the sentence level as the 
parse proceeds and thus are not an issue for the original (non- 
incremental) centering algorithm. The latter percentage rate, 
however, gives a realistic picture of the relevance of the
problem under scrutiny when one opts for a cognitively adequate,
incremental model of text understanding.
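
Both rates can be recomputed from the counts in Table 1. The short check below assumes that the rates are taken over the 230 pronoun occurrences; the variable names are ours.

```python
# Counts taken from Table 1 (pronoun occurrences in the corpus).
locally_ambiguous = 145
globally_ambiguous = 29
unambiguous = 56
total = locally_ambiguous + globally_ambiguous + unambiguous

# A non-incremental algorithm never sees local ambiguities, so only
# the globally ambiguous cases are out of its reach.
non_incremental_rate = (total - globally_ambiguous) / total

# An incremental algorithm is confronted with local ambiguities as well.
incremental_rate = unambiguous / total

print(round(100 * non_incremental_rate), round(100 * incremental_rate))
```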

Brief Survey of the Centering Model 

The framework of our model of anaphora resolution is
provided by the well-known centering mechanism (Grosz et al.,
1995). Psycholinguistic evidence for it is provided by
Gordon et al. (1993) and Brennan (1995), who, however, do not
consider the incrementality of language processing.
The theory of centering is intended to model the local coher- 
ence of discourse, i.e., coherence among the utterances Ui in 
a particular discourse segment (say, a paragraph of a text). 
Local coherence is opposed to global coherence, i.e., coher- 
ence with other segments in the discourse. Discourse entities 
serving to link one utterance to other utterances in a particular 
discourse segment are organized in terms of centers. Each
utterance Ui in a discourse segment is assigned a set of
forward-looking centers, Cf(Ui), and a unique backward-looking
center, Cb(Ui). The forward-looking centers of Ui depend only
on the expressions that constitute that utterance; previous
utterances provide no constraints on Cf(Ui). The elements of
Cf(Ui) are partially ordered to reflect relative prominence in
Ui. The most highly ranked element of Cf(Ui) that is realized
in Ui+1 (i.e., is associated with an expression that has
a valid interpretation in the underlying semantic/conceptual
representation language) is the Cb(Ui+1). The ranking
imposed on the elements of the Cf reflects the assumption that
the most highly ranked element of Cf(Ui) is the most
preferred antecedent of an anaphoric expression in Ui+1, while
the remaining elements are (partially) ordered according to
decreasing preference for establishing referential links.
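
The definition of the backward-looking center translates directly into a lookup. The following sketch assumes, for simplicity, that Cf(Ui) is given as a totally ordered list (the theory only requires a partial order) and that the entities realized in Ui+1 are already known; the function name is hypothetical.

```python
from typing import List, Optional, Set


def backward_looking_center(cf_prev: List[str], realized: Set[str]) -> Optional[str]:
    """Return the highest ranked element of Cf(Ui) realized in Ui+1."""
    for entity in cf_prev:  # ordered by decreasing prominence
        if entity in realized:
            return entity
    return None  # no element of Cf(Ui) is realized in Ui+1


# Sentences (1) and (2) of the product review fragment.
cf_u1 = ["LPS-105", "PERFORMANCE"]
realized_u2 = {"LPS-105", "ST-3144", "ACCESS-TIME", "RANK", "CATEGORY"}
cb_u2 = backward_looking_center(cf_u1, realized_u2)
```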

The theory of centering, in addition, defines several tran- 
sition relations across pairs of adjacent utterances (e.g., con- 
tinuation, retainment, smooth and rough shift), which differ 
from each other according to the degree by which succes- 
sive backward-looking centers are confirmed or rejected, and, 
if they are confirmed, whether they correspond to the most 